    Logic-Statistic Models with Constraints for Biological Sequence Analysis

    Bayesian Annotation Networks for Complex Sequence Analysis

    Probabilistic models that associate annotations with sequential data are widely used in computational biology and a range of other applications. Integrating such models with logic programs adds sophistication and generality, at the cost of potentially very high computational complexity. A methodology is proposed for modularizing such models into sub-models, each representing a particular interpretation of the input data to be analysed. Their composition naturally forms a Bayesian network, and we show how standard methods for prediction and training can be adapted iteratively to such composite models, with reasonable complexity results. Our methodology can be implemented using the probabilistic-logic PRISM system, developed by Sato et al., in a way that allows for practical applications.

    Inference with Constrained Hidden Markov Models in PRISM

    A Hidden Markov Model (HMM) is a common statistical model that is widely used for the analysis of biological sequence data and other sequential phenomena. In the present paper we show how HMMs can be extended with side-constraints and present constraint solving techniques for efficient inference. Defining HMMs with side-constraints in Constraint Logic Programming has advantages in terms of more compact expression and pruning opportunities during inference. We present a PRISM-based framework for extending HMMs with side-constraints and show how well-known constraints such as cardinality and all-different are integrated. We experimentally validate our approach on the biologically motivated problem of global pairwise alignment.
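
    To make the idea of a cardinality side-constraint concrete, here is a minimal sketch in plain Python. It is not the PRISM-based framework the abstract describes. It assumes a single constraint of the form "state `limited_state` may be visited at most `max_visits` times" and enforces it by carrying a visit counter alongside each Viterbi state, pruning any partial path that exceeds the limit. The function name, parameter names, and the toy model are all hypothetical.

```python
import math


def constrained_viterbi(obs, states, start_p, trans_p, emit_p,
                        limited_state, max_visits):
    """Most probable state path that visits `limited_state` at most
    `max_visits` times. Returns (path, probability), or None if no
    path satisfies the constraint."""
    # best[(state, visits)] = (log-probability, path so far)
    best = {}
    for s in states:
        visits = 1 if s == limited_state else 0
        p = start_p[s] * emit_p[s][obs[0]]
        if visits <= max_visits and p > 0:
            best[(s, visits)] = (math.log(p), [s])

    for o in obs[1:]:
        nxt = {}
        for (prev, visits), (logp, path) in best.items():
            for s in states:
                v = visits + (1 if s == limited_state else 0)
                if v > max_visits:
                    continue  # pruning: the side-constraint is violated
                p = trans_p[prev][s] * emit_p[s][o]
                if p <= 0:
                    continue
                cand = logp + math.log(p)
                if (s, v) not in nxt or cand > nxt[(s, v)][0]:
                    nxt[(s, v)] = (cand, path + [s])
        best = nxt

    if not best:
        return None
    logp, path = max(best.values(), key=lambda t: t[0])
    return path, math.exp(logp)


# Hypothetical two-state toy model, for illustration only.
states = ["A", "B"]
start_p = {"A": 0.4, "B": 0.6}
trans_p = {"A": {"A": 0.5, "B": 0.5}, "B": {"A": 0.5, "B": 0.5}}
emit_p = {"A": {"x": 0.9, "y": 0.2}, "B": {"x": 0.1, "y": 0.8}}
obs = ["y", "y", "x"]

unconstrained = constrained_viterbi(obs, states, start_p, trans_p, emit_p, "B", 3)
constrained = constrained_viterbi(obs, states, start_p, trans_p, emit_p, "B", 1)
print(unconstrained[0])  # ['B', 'B', 'A']
print(constrained[0])    # ['B', 'A', 'A']
```

    Folding the counter into the search state keeps inference exact while enlarging the state space by a factor of at most `max_visits + 1`. A constraint solver, as in the paper, can express such constraints declaratively rather than by hand-built state augmentation.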

    Efficient tabling of structured data with enhanced hash-consing

    Current tabling systems suffer from an increase in space complexity, time complexity, or both when dealing with sequences, due to the use of data structures for tabled subgoals and answers and the need to copy terms into and from the table area. This symptom can be seen not only in B-Prolog, which uses hash tables, but also in systems that use tries, such as XSB and YAP. In this paper, we apply hash-consing to tabling structured data in B-Prolog. While hash-consing can reduce space consumption when sharing is effective, it does not change the time complexity. We enhance hash-consing with two techniques, called input sharing and hash code memoization, which reduce the time complexity by avoiding computing hash codes for certain terms. The improved system eliminates the extra linear factor that the old system incurred when processing sequences, thus significantly enhancing the scalability of applications such as language parsing and bio-sequence analysis. We confirm this improvement with experimental results. Comment: 16 pages; TPLP, 201
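
    The general idea of hash-consing can be illustrated with a small Python sketch. This is a simplified model and not B-Prolog's implementation, and it does not model input sharing. Every class and method name is hypothetical. Each distinct term is stored once, and the lookup key for a new term is built from the identities (`uid`) of its already-interned children, so creating a term costs time proportional to its arity rather than to the size of its subterms. The `uid` here stands in for a memoized hash code.

```python
class Term:
    """An interned compound term; instances are created only by HashConsTable."""
    __slots__ = ("functor", "args", "uid")

    def __init__(self, functor, args, uid):
        self.functor = functor
        self.args = args
        self.uid = uid  # stands in for a memoized hash code


class HashConsTable:
    def __init__(self):
        self._terms = {}

    @staticmethod
    def _child_key(arg):
        # Interned subterms are identified by uid, so the key never
        # requires traversing (or re-hashing) the subterm structure.
        return ("t", arg.uid) if isinstance(arg, Term) else ("a", arg)

    def make(self, functor, *args):
        """Return the unique Term for functor(args), creating it if needed."""
        key = (functor, tuple(self._child_key(a) for a in args))
        term = self._terms.get(key)
        if term is None:
            term = Term(functor, args, uid=len(self._terms))
            self._terms[key] = term
        return term

    def from_list(self, items):
        """Build a cons-list term, sharing every common suffix."""
        term = self.make("nil")
        for item in reversed(items):
            term = self.make("cons", item, term)
        return term

    def __len__(self):
        return len(self._terms)


table = HashConsTable()
a = table.from_list([1, 2, 3])
b = table.from_list([1, 2, 3])
c = table.from_list([0, 1, 2, 3])
print(a is b)          # True: structurally equal terms are the same object
print(c.args[1] is a)  # True: the tail of c is shared with a
print(len(table))      # 5: nil plus four cons cells
```

    Because equal terms are the same object, equality checks reduce to identity checks, and sequences that share suffixes share storage.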

    Common variants in LEPR, IL6, AMD1, and NAMPT do not associate with risk of juvenile and childhood obesity in Danes: a case-control study

    BACKGROUND: Childhood obesity is a highly heritable disorder whose underlying genetic architecture is largely unknown. Four common variants involved in inflammatory-adipokine triggering (IL6 rs2069845, LEPR rs1137100, NAMPT rs3801266, and AMD1 rs2796749) have recently been associated with obesity and related traits in Indian children. The current study aimed to examine the effect of these variants on the risk of childhood/juvenile-onset obesity and on obesity-related quantitative traits in two Danish cohorts. METHODS: Genotype information was obtained for 1461 young Caucasian men from the Genetics of Overweight Young Adults (GOYA) study (overweight/obese: 739; normal weight: 722) and for participants in the Danish Childhood Obesity Biobank (TDCOB; overweight/obese: 1022; normal weight: 650). Overweight/obesity was defined as a body mass index (BMI) ≥25 kg/m²; among children and youths, age- and sex-specific cut-offs corresponding to an adult BMI ≥25 kg/m² were used. Risk of obesity was assessed using a logistic regression model, whereas obesity-related quantitative measures were analyzed using a general linear model (based on z-scores), stratifying on case status and adjusting for age and gender. Meta-analyses were performed using the fixed effects model. RESULTS: No statistically significant association with childhood/juvenile obesity was found for any of the four gene variants in the individual or combined analyses (rs2069845 OR: 0.94, CI: 0.85–1.04; rs1137100 OR: 1.01, CI: 0.90–1.14; rs3801266 OR: 0.96, CI: 0.84–1.10; rs2796749 OR: 1.02, CI: 0.90–1.15; p > 0.05). However, among normal weight children and juvenile men, the LEPR rs1137100 A-allele was significantly associated with lower BMI (β = −0.12, p = 0.0026). CONCLUSIONS: The IL6, LEPR, NAMPT, and AMD1 gene variants previously found to associate with obesity among Indian children did not associate with risk of obesity or obesity-related quantitative measures among Caucasian children and juvenile men from Denmark.
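
    A fixed-effects meta-analysis of odds ratios, as mentioned in the METHODS, is commonly done by inverse-variance pooling on the log scale. The abstract does not say which software or exact procedure was used, so the following Python sketch only illustrates the standard textbook approach. The function name and the input figures are hypothetical and are not data from the study.

```python
import math

Z95 = 1.959964  # normal quantile for a two-sided 95% interval


def fixed_effect_pool(estimates):
    """Pool per-cohort odds ratios with inverse-variance weights.

    `estimates` is a list of (odds_ratio, ci_low, ci_high) tuples,
    where the interval is a 95% confidence interval.
    Returns (pooled_or, pooled_ci_low, pooled_ci_high)."""
    weighted_sum = 0.0
    weight_total = 0.0
    for odds_ratio, ci_low, ci_high in estimates:
        # Recover the standard error of log(OR) from the CI width.
        se = (math.log(ci_high) - math.log(ci_low)) / (2 * Z95)
        weight = 1.0 / se ** 2
        weighted_sum += weight * math.log(odds_ratio)
        weight_total += weight
    pooled_log_or = weighted_sum / weight_total
    pooled_se = math.sqrt(1.0 / weight_total)
    return (
        math.exp(pooled_log_or),
        math.exp(pooled_log_or - Z95 * pooled_se),
        math.exp(pooled_log_or + Z95 * pooled_se),
    )


# Made-up inputs for two hypothetical cohorts (not results from the paper).
pooled = fixed_effect_pool([(1.2, 0.8, 1.8), (0.9, 0.7, 1.157)])
print(pooled)
```

    Under the fixed-effects assumption, all cohorts estimate the same underlying effect, so more precise cohorts (narrower intervals) receive proportionally more weight.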

    Increased frequency of rare missense PPP1R3B variants among Danish patients with type 2 diabetes

    BACKGROUND: PPP1R3B has been suggested as a candidate gene for monogenic forms of diabetes as well as type 2 diabetes (T2D) due to its association with glycaemic traits and its biological role in glycogen synthesis. OBJECTIVES: To study whether rare missense variants in PPP1R3B increase the risk of maturity onset diabetes of the young (MODY) or T2D, or affect measures of glucose metabolism. METHOD: Targeted resequencing of PPP1R3B was performed in 8,710 samples: MODY patients with unknown etiology (n = 54), newly diagnosed patients with T2D (n = 2,930), and population-based control individuals (n = 5,726, of whom n = 4,569 had normal glucose tolerance). All population-based sampled individuals were examined using an oral glucose tolerance test. RESULTS: Among n = 396 carriers, we identified twenty-three PPP1R3B missense mutations, none of which segregated with MODY. The burden of likely deleterious PPP1R3B variants was significantly increased, with a total of 17 carriers among patients with T2D (0.58% (95% CI: 0.36–0.93)) compared to 18 carriers among non-diabetic individuals (0.31% (95% CI: 0.20–0.49)), resulting in an increased risk of T2D (OR (95% CI) = 2.57 (1.14–5.79), p = 0.02 (age and sex adjusted)). Furthermore, carriers with diabetes had less abdominal fat and a higher serum concentration of LDL-cholesterol compared to patients with T2D without rare missense PPP1R3B variants. In addition, non-diabetic carriers had a higher birth weight compared to non-carriers. CONCLUSION: Rare missense PPP1R3B variants may predispose to T2D.